Spark v2 brainstorming

Prioritization

  1. Figure out reverse IPNI lookup
  2. DDO & realtime deal ingestion
  3. (everything else)

FDS Summit

A talk about Spark

  • Spark v1 - how it works now
  • How the landscape changed (Allocator pathways, DDO)
  • Spark v2 - what we want to build next

Additional resources

Allocator Pathways

Stop using datacapstats.io; start using on-chain events emitted by the verified registry actor.

Example Open Data Pathway allocation:

Quoting from https://space-meridian.slack.com/archives/C076S55SPTN/p1718009460976999:

Here’s a verified deal:

https://filfox.info/en/message/bafy2bzaceamgpi2pnow6kfheswoapp3qti7ko2agag6hwke5zzskfyadyl2gk?t=1

You can see the sub-transactions to f07 for updating datacap balance. The client/allocator is

https://datacapstats.io/clients?filter=f1iyihonsjs44ke5ftorilthinv22s3bjvzkv2v3i

- I don’t have an easy index to find the specific ‘AddVerifiedClient’ message

List of allocators:

https://allocator.tech/

DDO

https://github.com/filecoin-project/FIPs/discussions/730

Realtime updates

Listen for on-chain events to ingest new (DDO/f05) deals as they are being created.
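
A minimal sketch of what the ingestion filter could look like. It assumes a Lotus node that exposes the `Filecoin.GetActorEventsRaw` JSON-RPC method and that the events carry the FIP-0083 `$type` values `sector-activated` and `claim`. The helper names and the epoch range are placeholders for illustration, not existing Spark code. The sketch only builds the request body; sending it and decoding the returned entries is left out.

```python
import base64
import json

# Multicodec code for CBOR (decimal 81), used to tag the matched value.
CBOR_CODEC = 0x51


def cbor_text(value: str) -> bytes:
    """Encode a short string as a CBOR text string (major type 3)."""
    data = value.encode("utf-8")
    if len(data) >= 24:
        # Longer strings need an extra length byte; not needed for event names.
        raise ValueError("this sketch only handles strings shorter than 24 bytes")
    return bytes([0x60 | len(data)]) + data


def build_event_request(event_types, from_height, to_height, request_id=1):
    """Build a JSON-RPC request body that filters actor events by `$type`."""
    type_matchers = [
        {
            "codec": CBOR_CODEC,
            "value": base64.b64encode(cbor_text(name)).decode("ascii"),
        }
        for name in event_types
    ]
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "Filecoin.GetActorEventsRaw",  # assumed method name
        "params": [
            {
                "fromHeight": from_height,
                "toHeight": to_height,
                "fields": {"$type": type_matchers},
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder epoch range; a real ingester would track the chain head.
    request = build_event_request(["sector-activated", "claim"], 4_000_000, 4_000_010)
    print(json.dumps(request, indent=2))
```

Filtering on `$type` keeps the response limited to the two event kinds we care about: every activated deal should show up through the sector event, and FIL+ deals additionally through the claim event.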

Example transaction (sector-activation event is emitted by all deals, claim is emitted by FIL+ deals):

Related threads:

Reverse IPNI lookup

For DDO deals - how can we map PieceCID to a retrievable payload CID?

  • In Boost - make sure the Context ID is created in a way that can be reproduced by 3rd parties like Spark
  • In IPNI - expose an endpoint to list payload CIDs included in an advertisement with the given Context ID

Tasking & committees

We asked ChatGPT about the smallest committee size we need to get 95% confidence in the majority being honest.

  • How many balls do I need to draw out of a bag of black and white balls, until there is a majority of white balls in my hand, assuming originally there are 60% of white balls?

    ChatGPT

    Given that 60% of the balls in the bag are white, we can use probabilistic methods to determine the number of balls you need to draw to ensure a majority of white balls in your hand.

    Let's denote:

    • $p_W = 0.6$ as the proportion of white balls.
    • $p_B = 0.4$ as the proportion of black balls.
    • $k$ as the number of balls drawn.

    We want to find the minimum $k$ such that the probability of having more than half of the $k$ balls as white balls is very high.

    Approach:

    To achieve a majority, more than half of the drawn balls need to be white. For $k$ balls, this means having more than $\frac{k}{2}$ white balls. We can use the binomial distribution to calculate the probabilities.

    We can express the requirement as finding the smallest $k$ such that:

    $$P\left(X > \frac{k}{2}\right)$$

    where $X$ is the number of white balls drawn out of $k$, and $X \sim \text{Binomial}(k, 0.6)$.

    Calculation:

    1. Determine the condition:
      For $k$ draws, you need more than $\frac{k}{2}$ white balls, i.e., $X > \frac{k}{2}$.
    2. Use the binomial cumulative distribution function (CDF):
      The probability that $X$ is greater than $\frac{k}{2}$ can be found using the CDF of the binomial distribution. We want this probability to be very high (close to 1).

    Step-by-step Calculation:

    We use the normal approximation to the binomial distribution because calculating exact binomial probabilities for large $k$ can be complex.

    For the normal approximation:
    $$X \sim N(\mu, \sigma^2)$$
    where
    $$\mu = kp_W, \qquad \sigma^2 = kp_W(1 - p_W)$$

    We want $P(X > \frac{k}{2})$ to be very high (say at least 95%).

    1. Determine $k$ using normal approximation:

    Using the normal approximation,
    $$\mu = kp_W = 0.6k, \qquad \sigma = \sqrt{kp_W(1 - p_W)} = \sqrt{0.6k \times 0.4} = \sqrt{0.24k}$$

    For the condition $P(X > \frac{k}{2})$, convert this to the z-score:

    $$P\left( Z > \frac{\frac{k}{2} - 0.6k}{\sqrt{0.24k}} \right)$$

    For a 95% confidence level (using z-scores for normal distribution, where the corresponding z-score is 1.645), the threshold $\frac{k}{2}$ must lie at least 1.645 standard deviations below the mean:

    $$\frac{\frac{k}{2} - 0.6k}{\sqrt{0.24k}} \leq -1.645$$

    Simplifying this inequality:

    $$\frac{0.5k - 0.6k}{\sqrt{0.24k}} \leq -1.645$$

    $$\frac{-0.1k}{\sqrt{0.24k}} \leq -1.645$$

    $$-0.1 \sqrt{k} \leq -1.645 \sqrt{0.24}$$

    $$-0.1 \sqrt{k} \leq -0.8054$$

    $$\sqrt{k} \geq \frac{0.8054}{0.1}$$

    $$\sqrt{k} \geq 8.054$$

    $$k \geq 64.87$$

    Therefore, to ensure with high probability that you have a majority of white balls, you need to draw at least 65 balls from the bag.


  • How many balls do I need to draw out of a bag of black and white balls, until there is a majority of white balls in my hand, assuming originally there are at least 51% of white balls?

    ChatGPT

    Given that at least 51% of the balls in the bag are white, we want to determine the number of balls ($k$) you need to draw to ensure a majority of white balls in your hand with high probability.

    Approach:

    To achieve a majority, more than half of the drawn balls need to be white. For $k$ draws, this means having more than $\frac{k}{2}$ white balls. We can use the normal approximation to the binomial distribution for the calculation, given the proportions and desired confidence level.

    Let’s denote:

    • $p_W = 0.51$ as the proportion of white balls.
    • $p_B = 0.49$ as the proportion of black balls.

    We want the probability $P(X > \frac{k}{2})$ to be very high (close to 1), where $X$ is the number of white balls drawn out of $k$.

    Calculation:

    1. Mean and Variance:
      For a binomial distribution $X \sim \text{Binomial}(k, p_W)$,
      • The mean $\mu = kp_W$
      • The variance $\sigma^2 = kp_Wp_B$
      • The standard deviation $\sigma = \sqrt{kp_Wp_B}$

    2. Normal Approximation:
      For large $k$, the binomial distribution can be approximated by a normal distribution $N(\mu, \sigma^2)$.
    3. Determine $k$:
      We need to find $k$ such that the probability $P(X > \frac{k}{2})$ is very high. Using the normal approximation,

      $$X \sim N(\mu, \sigma^2)$$

      $$\mu = kp_W$$

      $$\sigma = \sqrt{kp_Wp_B}$$

      To ensure a majority of white balls:

      $$P\left( X > \frac{k}{2} \right) \approx P\left( Z > \frac{\frac{k}{2} - \mu}{\sigma} \right)$$

      where $Z$ is the standard normal variable.

      We want this probability to be at least 95%, corresponding to a z-score of approximately 1.645.

    4. Setting up the Inequality:
      For the probability to reach 95%, the z-score of the threshold $\frac{k}{2}$ must be at most $-1.645$:

      $$\frac{\frac{k}{2} - kp_W}{\sqrt{kp_Wp_B}} \leq -1.645$$

      Simplifying,

      $$\frac{k/2 - 0.51k}{\sqrt{k \cdot 0.51 \cdot 0.49}} \leq -1.645$$

      $$\frac{0.5k - 0.51k}{\sqrt{0.2499k}} \leq -1.645$$

      $$\frac{-0.01k}{\sqrt{0.2499k}} \leq -1.645$$

      $$-0.01 \sqrt{k} \leq -1.645 \sqrt{0.2499}$$

      $$-0.01 \sqrt{k} \leq -1.645 \times 0.4999$$

      $$-0.01 \sqrt{k} \leq -0.822$$

      $$\sqrt{k} \geq \frac{0.822}{0.01}$$

      $$\sqrt{k} \geq 82.2$$

      $$k \geq (82.2)^2$$

      $$k \geq 6757.44$$

    Therefore, you need to draw at least 6758 balls to ensure, with high probability, that you have a majority of white balls in your hand, given that the proportion of white balls is at least 51%.

  • How many balls do I need to draw out of a bag of black and white balls, until there is a majority of white balls in my hand, assuming originally there are at least 66% of white balls?

    ChatGPT

    Given that at least 66% of the balls in the bag are white, we want to determine the number of balls ($k$) you need to draw to ensure a majority of white balls in your hand with high probability.

    Approach

    To achieve a majority, more than half of the $k$ drawn balls need to be white. For $k$ draws, this means having more than $\frac{k}{2}$ white balls. We can use the normal approximation to the binomial distribution for the calculation, given the proportions and desired confidence level.

    Let’s denote:

    • $p_W = 0.66$ as the proportion of white balls.
    • $p_B = 0.34$ as the proportion of black balls.

    We want the probability $P(X > \frac{k}{2})$ to be very high (close to 1), where $X$ is the number of white balls drawn out of $k$.

    Calculation

    1. Mean and Variance:
      For a binomial distribution $X \sim \text{Binomial}(k, p_W)$:
      • The mean $\mu = kp_W$
      • The variance $\sigma^2 = kp_Wp_B$
      • The standard deviation $\sigma = \sqrt{kp_Wp_B}$

    2. Normal Approximation:
      For large $k$, the binomial distribution can be approximated by a normal distribution $N(\mu, \sigma^2)$.
    3. Determine $k$:
      We need to find $k$ such that the probability $P(X > \frac{k}{2})$ is very high. Using the normal approximation,

      $$X \sim N(\mu, \sigma^2)$$

      $$\mu = 0.66k$$

      $$\sigma = \sqrt{0.66k \times 0.34} = \sqrt{0.2244k}$$

      To ensure a majority of white balls:

      $$P\left( X > \frac{k}{2} \right) \approx P\left( Z > \frac{\frac{k}{2} - 0.66k}{\sqrt{0.2244k}} \right)$$

      where $Z$ is the standard normal variable.

      We want this probability to be at least 95%, corresponding to a z-score of approximately 1.645.

    4. Setting up the Inequality:
      For the probability to reach 95%, the z-score of the threshold $\frac{k}{2}$ must be at most $-1.645$:

      $$\frac{\frac{k}{2} - 0.66k}{\sqrt{0.2244k}} \leq -1.645$$

      Simplifying,

      $$\frac{0.5k - 0.66k}{\sqrt{0.2244k}} \leq -1.645$$

      $$\frac{-0.16k}{\sqrt{0.2244k}} \leq -1.645$$

      $$-0.16 \sqrt{k} \leq -1.645 \sqrt{0.2244}$$

      $$-0.16 \sqrt{k} \leq -1.645 \times 0.4741$$

      $$-0.16 \sqrt{k} \leq -0.7797$$

      $$\sqrt{k} \geq \frac{0.7797}{0.16}$$

      $$\sqrt{k} \geq 4.873$$

      $$k \geq 4.873^2$$

      $$k \geq 23.74$$

    Therefore, you need to draw at least 24 balls to ensure, with high probability, that you have a majority of white balls in your hand, given that the proportion of white balls is at least 66%.
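
The committee-size estimates above can be cross-checked numerically. This is a minimal sketch, assuming committee members are sampled independently and each one is honest with probability p (the white-ball proportion). The function names are ours, chosen for illustration. It computes the normal-approximation bound k ≥ (z·√(p(1−p)) / (p − 0.5))² and, separately, the smallest odd k whose exact binomial majority probability reaches the target confidence.

```python
import math
from statistics import NormalDist


def majority_probability(k: int, p: float) -> float:
    """Exact P(X > k/2) for X ~ Binomial(k, p), with 0 < p < 1.

    Terms are computed in log space so large k does not overflow.
    """
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for i in range(k // 2 + 1, k + 1):  # strict majority only
        log_pmf = (
            math.lgamma(k + 1) - math.lgamma(i + 1) - math.lgamma(k - i + 1)
            + i * log_p + (k - i) * log_q
        )
        total += math.exp(log_pmf)
    return total


def normal_approx_size(p: float, confidence: float = 0.95) -> int:
    """Committee size from the normal approximation, for p > 0.5."""
    z = NormalDist().inv_cdf(confidence)  # about 1.645 for 95%
    return math.ceil((z * math.sqrt(p * (1 - p)) / (p - 0.5)) ** 2)


def smallest_committee(p: float, confidence: float = 0.95, max_size: int = 20001) -> int:
    """Smallest odd k with an exact honest-majority probability >= confidence."""
    for k in range(1, max_size + 1, 2):
        if majority_probability(k, p) >= confidence:
            return k
    raise ValueError("no committee size up to max_size reaches the target")


if __name__ == "__main__":
    for p in (0.60, 0.66):
        print(p, normal_approx_size(p), smallest_committee(p))
```

Only odd sizes are searched because an even committee can tie, and a tie is not a majority. The exact search is quadratic in k, so for proportions close to 50% the normal approximation is the cheaper first estimate.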


Documentation updates

Update our docs - list which allocators we recognise, document the process for allocators to apply